66 research outputs found

    Ord2Seq: Regarding Ordinal Regression as Label Sequence Prediction

    Full text link
    Ordinal regression refers to classifying object instances into ordinal categories. It has been widely studied in many scenarios, such as medical disease grading, movie rating, etc. Known methods focused only on learning inter-class ordinal relationships, but still incur limitations in distinguishing adjacent categories thus far. In this paper, we propose a simple sequence prediction framework for ordinal regression called Ord2Seq, which, for the first time, transforms each ordinal category label into a special label sequence and thus regards an ordinal regression task as a sequence prediction process. In this way, we decompose an ordinal regression task into a series of recursive binary classification steps, so as to subtly distinguish adjacent categories. Comprehensive experiments show the effectiveness of distinguishing adjacent categories for performance improvement and our new approach exceeds state-of-the-art performances in four different scenarios. Codes are available at https://github.com/wjh892521292/Ord2Seq.Comment: Accepted by ICCV202

    OneSeg: Self-learning and One-shot Learning based Single-slice Annotation for 3D Medical Image Segmentation

    Full text link
    As deep learning methods continue to improve medical image segmentation performance, data annotation is still a big bottleneck due to the labor-intensive and time-consuming burden on medical experts, especially for 3D images. To significantly reduce annotation efforts while attaining competitive segmentation accuracy, we propose a self-learning and one-shot learning based framework for 3D medical image segmentation by annotating only one slice of each 3D image. Our approach takes two steps: (1) self-learning of a reconstruction network to learn semantic correspondence among 2D slices within 3D images, and (2) representative selection of single slices for one-shot manual annotation and propagating the annotated data with the well-trained reconstruction network. Extensive experiments verify that our new framework achieves comparable performance with less than 1% annotated data compared with fully supervised methods and generalizes well on several out-of-distribution testing sets

    ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view ECG Synthesis Conditioned on Heart Diseases

    Full text link
    Electrocardiogram (ECG) is a widely used non-invasive diagnostic tool for heart diseases. Many studies have devised ECG analysis models (e.g., classifiers) to assist diagnosis. As an upstream task, researches have built generative models to synthesize ECG data, which are beneficial to providing training samples, privacy protection, and annotation reduction. However, previous generative methods for ECG often neither synthesized multi-view data, nor dealt with heart disease conditions. In this paper, we propose a novel disease-aware generative adversarial network for multi-view ECG synthesis called ME-GAN, which attains panoptic electrocardio representations conditioned on heart diseases and projects the representations onto multiple standard views to yield ECG signals. Since ECG manifestations of heart diseases are often localized in specific waveforms, we propose a new "mixup normalization" to inject disease information precisely into suitable locations. In addition, we propose a view discriminator to revert disordered ECG views into a pre-determined order, supervising the generator to obtain ECG representing correct view characteristics. Besides, a new metric, rFID, is presented to assess the quality of the synthesized ECG signals. Comprehensive experiments verify that our ME-GAN performs well on multi-view ECG signal synthesis with trusty morbid manifestations

    Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for Imbalanced Medical Classification

    Full text link
    Deep learning approaches exhibit promising performances on various text tasks. However, they are still struggling on medical text classification since samples are often extremely imbalanced and scarce. Different from existing mainstream approaches that focus on supplementary semantics with external medical information, this paper aims to rethink the data challenges in medical texts and present a novel framework-agnostic algorithm called Text2Tree that only utilizes internal label hierarchy in training deep learning models. We embed the ICD code tree structure of labels into cascade attention modules for learning hierarchy-aware label representations. Two new learning schemes, Similarity Surrogate Learning (SSL) and Dissimilarity Mixup Learning (DML), are devised to boost text classification by reusing and distinguishing samples of other labels following the label representation hierarchy, respectively. Experiments on authoritative public datasets and real-world medical records show that our approach stably achieves superior performances over classical and advanced imbalanced classification methods.Comment: EMNLP 2023 Findings. Code: https://github.com/jyansir/Text2Tre

    GCL: Gradient-Guided Contrastive Learning for Medical Image Segmentation with Multi-Perspective Meta Labels

    Full text link
    Since annotating medical images for segmentation tasks commonly incurs expensive costs, it is highly desirable to design an annotation-efficient method to alleviate the annotation burden. Recently, contrastive learning has exhibited a great potential in learning robust representations to boost downstream tasks with limited labels. In medical imaging scenarios, ready-made meta labels (i.e., specific attribute information of medical images) inherently reveal semantic relationships among images, which have been used to define positive pairs in previous work. However, the multi-perspective semantics revealed by various meta labels are usually incompatible and can incur intractable "semantic contradiction" when combining different meta labels. In this paper, we tackle the issue of "semantic contradiction" in a gradient-guided manner using our proposed Gradient Mitigator method, which systematically unifies multi-perspective meta labels to enable a pre-trained model to attain a better high-level semantic recognition ability. Moreover, we emphasize that the fine-grained discrimination ability is vital for segmentation-oriented pre-training, and develop a novel method called Gradient Filter to dynamically screen pixel pairs with the most discriminating power based on the magnitude of gradients. Comprehensive experiments on four medical image segmentation datasets verify that our new method GCL: (1) learns informative image representations and considerably boosts segmentation performance with limited labels, and (2) shows promising generalizability on out-of-distribution datasets

    Doctor Imitator: Hand-Radiography-based Bone Age Assessment by Imitating Scoring Methods

    Full text link
    Bone age assessment is challenging in clinical practice due to the complicated bone age assessment process. Current automatic bone age assessment methods were designed with rare consideration of the diagnostic logistics and thus may yield certain uninterpretable hidden states and outputs. Consequently, doctors can find it hard to cooperate with such models harmoniously because it is difficult to check the correctness of the model predictions. In this work, we propose a new graph-based deep learning framework for bone age assessment with hand radiographs, called Doctor Imitator (DI). The architecture of DI is designed to learn the diagnostic logistics of doctors using the scoring methods (e.g., the Tanner-Whitehouse method) for bone age assessment. Specifically, the convolutions of DI capture the local features of the anatomical regions of interest (ROIs) on hand radiographs and predict the ROI scores by our proposed Anatomy-based Group Convolution, summing up for bone age prediction. Besides, we develop a novel Dual Graph-based Attention module to compute patient-specific attention for ROI features and context attention for ROI scores. As far as we know, DI is the first automatic bone age assessment framework following the scoring methods without fully supervised hand radiographs. Experiments on hand radiographs with only bone age supervision verify that DI can achieve excellent performance with sparse parameters and provide more interpretability.Comment: Original Title: "Doctor Imitator: A Graph-based Bone Age Assessment Framework Using Hand Radiographs" @inproceedings{chen2020doctor, title={Doctor imitator: A graph-based bone age assessment framework using hand radiographs}, author={Chen, Jintai and Yu, Bohan and Lei, Biwen and Feng, Ruiwei and Chen, Danny Z and Wu, Jian}, booktitle={MICCAI}, year={2020}

    Clean air for some : Unintended spillover effects of regional air pollution policies

    Get PDF
    China has enacted a number of ambitious pollution control policies to mitigate air pollution in urban areas. Unintended side effects of these policies to other environmental policy arenas and regions have largely been ignored. To bridge this gap, we use a multiregional input-output model in combination with an atmospheric chemical transport model to simulate clean air policy scenarios and evaluate their environmental impacts on primary PM2.5 and secondary precursor emissions, as well as CO2 emissions and water consumption, in the target region and spillover effects to other regions. Our results show that the reduction in primary PM2.5 and secondary precursor emissions in the target regions comes at the cost of increasing emissions especially in neighboring provinces. Similarly, co-benefits of lower CO2 emissions and reduced water consumption in the target region are achieved at the expense of higher impacts elsewhere, through outsourcing production to less developed regions in China

    Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models

    Full text link
    Large language models (LLMs) have achieved remarkable advancements in the field of natural language processing. However, the sheer scale and computational demands of these models present formidable challenges when considering their practical deployment in resource-constrained contexts. While techniques such as chain-of-thought (CoT) distillation have displayed promise in distilling LLMs into small language models (SLMs), there is a risk that distilled SLMs may still carry over flawed reasoning or hallucinations inherited from their LLM counterparts. To address these issues, we propose a twofold methodology: First, we introduce a novel method for distilling the self-evaluation capability inherent in LLMs into SLMs, which aims to mitigate the adverse effects of erroneous reasoning and reduce hallucinations. Second, we advocate for a comprehensive distillation process that incorporates multiple distinct chain-of-thought and self-evaluation paradigms and ensures a more holistic and robust knowledge transfer into SLMs. Experiments on three NLP benchmarks demonstrate that our method significantly improves the performance of distilled SLMs and sheds light on the path towards developing smaller models closely aligned with human cognition.Comment: 13 pages, 5 figure

    PGformer: Proxy-Bridged Game Transformer for Multi-Person Extremely Interactive Motion Prediction

    Full text link
    Multi-person motion prediction is a challenging task, especially for real-world scenarios of densely interacted persons. Most previous works have been devoted to studying the case of weak interactions (e.g., hand-shaking), which typically forecast each human pose in isolation. In this paper, we focus on motion prediction for multiple persons with extreme collaborations and attempt to explore the relationships between the highly interactive persons' motion trajectories. Specifically, a novel cross-query attention (XQA) module is proposed to bilaterally learn the cross-dependencies between the two pose sequences tailored for this situation. Additionally, we introduce and build a proxy entity to bridge the involved persons, which cooperates with our proposed XQA module and subtly controls the bidirectional information flows, acting as a motion intermediary. We then adapt these designs to a Transformer-based architecture and devise a simple yet effective end-to-end framework called proxy-bridged game Transformer (PGformer) for multi-person interactive motion prediction. The effectiveness of our method has been evaluated on the challenging ExPI dataset, which involves highly interactive actions. We show that our PGformer consistently outperforms the state-of-the-art methods in both short- and long-term predictions by a large margin. Besides, our approach can also be compatible with the weakly interacted CMU-Mocap and MuPoTS-3D datasets and achieve encouraging results. Our code will become publicly available upon acceptance

    Odd-Char Multivariate Hidden Field Equations

    Get PDF
    We present a multivariate version of Hidden Field Equations (HFE) over a finite field of odd characteristic, with an extra ``embedding\u27\u27 modifier. Combining these known ideas makes our new MPKC (multivariate public key cryptosystem) more efficient and scalable than any other extant multivariate encryption scheme. Switching to odd characteristics in HFE-like schemes affects how an attacker can make use of field equations. Extensive empirical tests (using MAGMA-2.14, the best commercially available \mathbold{F_4} implementation) suggests that our new construction is indeed secure against algebraic attacks using Gröbner Basis algorithms. The ``embedding\u27\u27 serves both to narrow down choices of pre-images and to guard against a possible Kipnis-Shamir type (rank-based) attack. We may hence reasonably argue that for practical sizes, prior attacks take exponential time. We demonstrate that our construction is in fact efficient by implementing practical-sized examples of our ``odd-char HFE\u27\u27 with 3 variables (``THFE\u27\u27) over GF(31)\mathrm{GF}(31). To be precise, our preliminary THFE implementation is 15×15\times--20×20\times the speed of RSA-1024
    corecore